MAX PLANCK SOCIETY
for Promotion of the Sciences e.V. (Registered Association)

37073 Goettingen

Glycerol-3-phosphate dehydrogenase (GPDH)

This invention concerns DNA sequences that code for a glycerol-3-phosphate
dehydrogenase (GPDH) and the alleles as well as the derivatives of these DNA
sequences.

This invention also concerns genomic clones that contain the complete
gene of a glycerol-3-phosphate dehydrogenase and alleles as well as derivatives
of this gene.

This invention also concerns promoters and other regulator elements of
glycerol-3-phosphate dehydrogenase genes.

Glycerol-3-phosphate dehydrogenase (GPDH; EC 1.1.1.8), also known as
dihydroxyacetone phosphate reductase, is substantially involved in
triglyceride biosynthesis in plants by supplying glycerol-3-phosphate.

Fatty acid biosynthesis and triglyceride biosynthesis can be regarded as
separate biosynthesis pathways owing to compartmentalization, but as one
biosynthesis pathway from the standpoint of the end product. De novo
biosynthesis of fatty acids takes place in the plastids and is catalyzed by
three enzymes or enzyme systems, i.e., (1) acetyl-CoA carboxylase (ACCase), (2)
fatty acid synthase (FAS), and (3) acyl-[ACP]-thioesterase (TE). The end
products of this reaction sequence in most organisms are either palmitic acid,
stearic acid or, after desaturation, oleic acid.

In the cytoplasm, however, triglyceride biosynthesis takes place via
the so-called "Kennedy pathway" in the endoplasmic reticulum from
glycerol-3-phosphate, which is made available by the activity of
glycerol-3-phosphate dehydrogenase (S.A. Finlayson et al., Arch. Biochem.
Biophys. (1980) pages 179-185), and from fatty acids present in the form of
acyl-CoA substrates.

Probably the first discovery of the enzymatic activity of
glycerol-3-phosphate dehydrogenase in plants involved potato tubers (G.T.
Santora et al., Arch. Biochem. Biophys. 125 (1979) pages 403-411). This
activity had not been observed in other plants before then (B. Konig and E.
Heinz, Planta 118 (1974) pages 159-169), so the existence of the enzyme had not
been established. Thus the formation of glycerol-3-phosphate through the
activity of a glycerol kinase was discussed as an alternative biosynthesis
pathway. Santora et al., loc. cit., subsequently detected GPDH in spinach
leaves and succeeded in enriching the enzyme approximately 10,000-fold. They
determined the native molecular weight to be 63.5 kDa and found the optimum pH
for the reduction of dihydroxyacetone phosphate (DHAP) to be 6.8 to 9.5 for the
back reaction. GPDH was likewise detected in Ricinus endosperm (Finlayson et
al., Arch. Biochem. Biophys. 122 (1980) pages 179-185). According to more
recent work (Gee et al., Plant Physiol. 86 (1988a) pages 98-103), two GPDH
activities could be detected in enriched fractions, a cytoplasmic fraction
(20-25%) and a plastid fraction (75-80%). The two forms are regulated
differently. Thus, for example, the cytoplasmic isoform can be activated by
F2,6DP, while the plastid isoform is activated by thioredoxin (R.W. Gee et al.,
Plant Physiol. 86 (1988) pages 98-103 and R.W. Gee et al., Plant Physiol. 87
(1988) pages 379-383).

The methods of molecular biology are increasingly finding their way into
plant breeding practice. Changes in biosynthesis output with the formation of
new components and/or higher yields of these components can be achieved with
the help of gene manipulation, e.g., transfer of genes which code for enzymes.
As one of the most important enzymes of triacylglyceride synthesis, GPDH has a
significant influence on the oil yield of plants.

It is thus the object of this invention to improve the oil yield of crop
plants by influencing the triacylglyceride content.

This object is achieved with the DNA sequences according to patent claim
1 and the genes from the genomic clones according to patent claim 4.

This invention concerns DNA sequences that code for a glycerol-3-phosphate
dehydrogenase, and alleles as well as derivatives of these DNA sequences.

This invention also concerns genomic clones that contain a complete gene
of a glycerol-3-phosphate dehydrogenase including the structural gene, the
promoter and other regulator sequences, and alleles as well as derivatives of
this gene.

This invention likewise concerns the promoters and other regulator
elements of glycerol-3-phosphate dehydrogenase genes from the specified genomic
clones, and the alleles as well as derivatives of these promoters.

This invention additionally concerns a method of producing plants, plant 
parts and plant products in which the triacylglyceride content or fatty acid 
content is altered, where DNA sequences or genes are transferred from the 
genomic clones by the methods of genetic engineering. 

This invention also concerns the use of said DNA sequences or one of the 
genes originating from said genomic clones for altering the triacylglyceride 
content or its fatty acid pattern in plants. 

Finally, this invention concerns transgenic plants, plant parts and
plant products produced according to the aforementioned method.

The figures serve to clarify the present invention.
They show the following:



2170611 



Figure 1: Comparison of the derived amino acid sequences of the ClGPDH30 and
          ClGPDH109 cDNAs as well as the gene from the ClGPDHg3 genomic clone
          with the GPDH amino acid sequence of the mouse (MmGPDH);
Figure 2: Separation of proteins from BB26-36 cells by gel electrophoresis;
Figure 3: Map of the insertions contained in the ClGPDHg5, ClGPDHg9 and
          ClGPDHg3 genomic clones with various restriction enzymes;
Figure 4: Schematic diagram of the functional areas of the genes contained in
          the ClGPDHg5, ClGPDHg9 and ClGPDHg3 genomic clones; and
Figure 5: Northern Blot with RNAs from various plant tissues, hybridized with
          ClGPDH20 cDNA as a probe.

It is understood that allelic variants and derivatives of DNA sequences or
genes according to this invention are included within the scope of this
invention, provided that these modified DNA sequences or modified genes code
for glycerol-3-phosphate dehydrogenase. The allelic variants and derivatives
include, for example, deletions, substitutions, insertions, inversions and
additions to DNA sequences or genes according to this invention.

Any plant material that produces glycerol-3-phosphate dehydrogenase in
sufficient quantities is a suitable starting material for isolating cDNAs that
code for glycerol-3-phosphate dehydrogenase. Isolated embryos from the plant
Cuphea lanceolata, indigenous to Central America, have proven to be an
especially suitable starting material in the present invention.

Functional complementation was used for isolation of DNA sequences
according to this invention. This refers to complementation of mutant
microorganisms with heterologous cDNA. Functional complementation was
performed after infecting E. coli strain BB26-36, which is auxotrophic for
glycerol, with phagemids containing plasmids with cDNAs from Cuphea lanceolata.
Plasmids isolated from functionally complemented bacteria were cleaved with
restriction endonucleases and separated by electrophoresis. The cDNAs
contained in the plasmids were grouped into two classes that differ in the
size of their insertions. Retransformation confirmed that the isolated cDNAs
were capable of complementing the BB26-36 mutant.

The complete coding region of one of the two classes codes for a
glycerol-3-phosphate dehydrogenase and is contained in the ClGPDH20 cDNA clone.
This is an EcoRI-ApaI fragment that has 1354 base pairs. The complete 1354
base pair DNA sequence of the ClGPDH20 cDNA and the amino acid sequence derived
from it are entered in the Sequence Listing as SEQ ID NO:1. ClGPDH20 cDNA was
sequenced double stranded. Proceeding from the ATG start codon, the cDNA
codes from positions 17 to 1132 for a protein with 372 amino acids (ending at
the TAG stop codon), which is expressed as a fusion with lacZ without a shift
in the reading frame. The estimated molecular weight is 40.8 kDa. Two base
pairs (CA) preceding ATG are included with the cDNA. The first 14 nucleotides
are attributed to the DNA sequence of the fusion with lacZ, and the linker
sequence is indicated at the 3' end. The polyA signal is found at positions
1329 to 1334 in the 3' untranslated region.
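
As a rough plausibility check on the stated molecular weight, the mass of a
372-residue protein can be estimated from an average residue mass. The sketch
below assumes about 110 Da per residue, a common rule of thumb that is not
taken from this document; the function name is illustrative only.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass in kDa from the number of residues."""
    if n_residues <= 0:
        raise ValueError("number of residues must be positive")
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# 372 residues as stated for the protein encoded by the ClGPDH20 cDNA
print(round(approx_mass_kda(372), 1))
```

Under this assumption the estimate is about 40.9 kDa, which agrees well with
the 40.8 kDa given above; an exact value would require summing the actual
residue masses of the derived sequence.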

It is assumed that ClGPDH20 cDNA codes for a cytoplasmic isoform, because no
transit peptide can be detected in homology comparisons with mouse GPDH (see
Figure 1). On the basis of the position of an assumed NADH binding site
corresponding to the consensus sequence GxGxxG (see positions 29 to 34 in the
ClGPDH20 amino acid sequence in Figure 1; R.K. Wierenga et al., Biochem. 24
(1985) pages 1346-1357), the N-terminal sequence of 28 amino acids is not
sufficient to form a transit peptide, whose length varies between 32 and 75
amino acids (Y. Gavel et al., FEBS Lett. 261 (1990) pages 455-458).
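
The position of the GxGxxG motif can be checked directly against the N-terminal
residues given for SEQ ID NO:1 in the Sequence Listing. The following is a
minimal sketch; the one-letter string is transcribed from the three-letter
listing, and the helper name and the regular expression are illustrative
choices, not part of the original analysis.

```python
import re

# First 43 residues of the ClGPDH20 protein (SEQ ID NO:1), one-letter code
N_TERMINUS = "MAPSELNCTHQNQHSSGYDGPRSRVTVVGSGNWGSVAAKLIAT"


def find_motif(seq, pattern=r"G.G..G"):
    """Return the 1-based (start, end) of the first GxGxxG match, or None."""
    match = re.search(pattern, seq)
    if match is None:
        return None
    return match.start() + 1, match.end()


hit = find_motif(N_TERMINUS)
print(hit)         # 1-based span of the motif
print(hit[0] - 1)  # number of residues preceding the motif
```

The motif is found at residues 29 to 34, leaving 28 residues in front of it,
which is shorter than the 32 to 75 residues cited for transit peptides.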

A cDNA library from Cuphea lanceolata was screened with ClGPDH20 cDNA as
a probe for isolation of additional GPDH cDNAs, and a total of 52 cDNA clones
were isolated. The 18 longest cDNAs were completely or partially sequenced.
The ClGPDH109, ClGPDH30 and ClGPDH132 cDNA clones contain cDNAs with the
complete coding region or a virtually complete cDNA of GPDH.

The ClGPDH109 cDNA clone contains the complete coding region of GPDH on a
1464 base pair EcoRI-ApaI DNA fragment which codes for a protein with 381 amino
acids. The DNA sequence and the amino acid sequence derived from it are shown
as SEQ ID NO:2 in the Sequence Listing. The DNA fragment was sequenced double
stranded. The coding region begins with the ATG start codon in position 45 and
ends in position 1187, followed by the TAG stop codon (positions 1188 to 1190).
The cDNA itself begins at position 15. The first 14 nucleotides are attributed
to the DNA sequence of the fusion with lacZ. The polyA signal (positions 1414
to 1419) and the polyA region (positions 1446 to 1454) as well as the linker
sequence (positions 1459 to 1464) are found in the untranslated region at the
3' end.

Another cDNA, ClGPDH30, also contains the complete coding region of GPDH
on a 1390 base pair EcoRI-XhoI fragment, which codes for a protein with 372
amino acids. The DNA sequence, which was sequenced double stranded, and the
amino acid sequence derived from it are listed as SEQ ID NO:4 in the Sequence
Listing. The protein coding sequence begins with the ATG start codon at
position 34 and ends at position 1149, before the stop codon. The first 14
base pairs are attributed to the sequence of the fusion with lacZ. The polyA
signal (positions 1349 to 1354) and the polyA region (positions 1366 to 1384)
are found in the 3' untranslated region.

The ClGPDH132 cDNA clone with 1490 base pairs is an EcoRI-XhoI fragment,
the DNA sequence of which and the amino acid sequence derived from it are shown
as SEQ ID NO:3 in the Sequence Listing. The DNA fragment was sequenced double
stranded. ClGPDH132 cDNA is missing 14 amino acids at the N terminus in
comparison with ClGPDH109 cDNA. The open reading frame begins at position 15
and ends at position 1115, followed by the stop codon at positions 1116 to
1118. Consequently, ClGPDH132 cDNA codes for a protein with 367 amino acids
and likewise includes the coding region for glycerol-3-phosphate dehydrogenase
with the exception of 14 amino acids. The first 14 nucleotides are to be
attributed to the lacZ fusion sequence, and the linker sequence (positions 1485
to 1490) is at the 3' end. The polyA signal and the polyA region are located
at positions 1343 to 1348 and 1465 to 1484, respectively, in the 3'
untranslated region.
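
The coordinates given for the four cDNAs can be cross-checked against the
stated protein lengths: an open reading frame running from position s to
position e (stop codon excluded) must span a multiple of three bases and
encode (e - s + 1) / 3 amino acids. The sketch below uses only the positions
quoted in the text; the table layout and function names are illustrative.

```python
# (clone, first base of ATG / reading frame, last coding base, stated residues)
CDNA_ORFS = [
    ("ClGPDH20", 17, 1132, 372),
    ("ClGPDH109", 45, 1187, 381),
    ("ClGPDH30", 34, 1149, 372),
    ("ClGPDH132", 15, 1115, 367),
]


def codon_count(start: int, end: int) -> int:
    """Number of codons between 1-based positions start and end, inclusive."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("coordinates do not describe a whole number of codons")
    return length // 3


def check_orfs(orfs):
    """Map each clone name to True if its coordinates match the stated length."""
    return {name: codon_count(s, e) == n for name, s, e, n in orfs}


for clone, consistent in check_orfs(CDNA_ORFS).items():
    print(clone, consistent)
```

All four entries are internally consistent, and the difference between
ClGPDH109 (381 residues) and ClGPDH132 (367 residues) is the 14 N-terminal
amino acids mentioned above.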

Two classes of cDNAs can be distinguished on the basis of sequence data.
Accordingly, ClGPDH20 and ClGPDH30 cDNAs belong to class A and ClGPDH132 and
ClGPDH109 cDNAs belong to class B.

As Figure 1 shows, the derived amino acid sequences of ClGPDH30 and
ClGPDH109 cDNAs show 96% identical amino acids. At the same time, the
derived amino acid sequences of the cDNAs and that of a gene to be assigned
to another class, ClGPDHg3, were compared with the GPDH amino acid sequence of
the mouse (MmGPDH). The differences between the amino acid sequence derived
from the ClGPDH109 cDNA, the coded amino acid sequence of the gene and the
mouse GPDH in comparison with the amino acid sequence derived from ClGPDH30 are
shown in black. On average, the identity of the derived proteins of the
cDNAs and the GPDH gene with the mouse protein is approximately 50%.
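
Identity figures such as the 96% and 50% quoted above are obtained by counting
identical residues over the aligned positions of two sequences. The following
is a minimal sketch of that calculation. It assumes the two sequences have
already been aligned to equal length with "-" as the gap character, and that
identity is taken over columns without gaps; the document does not state which
convention was used, and the example strings are toy data, not the real
alignment.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identical residues over gap-free columns of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy example: 8 of 10 aligned residues are identical
print(percent_identity("MAPSELNCTH", "MAPSDLNCSH"))
```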

ClGPDH20 cDNA was cloned into an expression vector and expressed in E.
coli as a fusion protein with glutathione-S-transferase. To do so, the cDNA
was cloned beginning with ATG (see position 17, SEQ ID NO:1) into pGX, a
derivative of the pGEXKG expression vector (K.L. Guan et al., Analytical
Biochem. 122 (1991) pages 262-267). BB26-36 cells were harvested at various
times after administration of IPTG (isopropyl-β-thiogalactopyranoside) and
their proteins were separated by gel electrophoresis. Figure 2 shows gel
electrophoretic separation of BB26-36 cell extracts. The left column shows the
proteins of cells with the pGX expression vector (without fusion; 26 kDa
protein), and the right side shows proteins of cells with the pGXGPDH20
expression vector, which codes for a fusion protein of 67 kDa. The times
given in hours indicate the times of sampling after IPTG induction. This
clearly shows an enrichment of the fusion protein after two hours. Enzyme
activity was subsequently determined by a GPDH enzyme assay with the isolated
fusion protein, and significant enzyme activity was measured. This finding
shows that ClGPDH20 cDNA contains a functional gene for expression of GPDH.

Furthermore, genomic clones were isolated by screening a library of genomic
DNA of Cuphea lanceolata with ClGPDH20 cDNA as a probe. By this method, 31
genomic clones were isolated. The genomic clones contain a complete
structural gene of a glycerol-3-phosphate dehydrogenase and alleles plus
derivatives of this gene together with the promoter sequence and other
regulator elements. This means that they form complete transcription units.

Three genomic clones are characterized below. These include the ClGPDHg3
genomic clone with a 15.9 kb DNA insertion, the ClGPDHg5 genomic clone with a
17.7 kb DNA insertion, and the ClGPDHg9 genomic clone with a 15.6 kb DNA
insertion. Figure 3 shows a map of the DNA insertions of the genomic clones
with various restriction enzymes. The black bars indicate the fragments that
hybridize with a 5' probe of the ClGPDH20 cDNA. The white bars show the areas
of DNA insertions that were sequenced and are included in the Sequence
Listing.

Sequence analysis of the areas presented in Figure 3 (white bars) of the
three genomic clones ClGPDHg5, ClGPDHg3 and ClGPDHg9 has shown that they
contain the complete or partial structural gene of GPDH with all or most of the
promoter sequence (5' direction). Figure 4 shows a schematic diagram of the
sequenced areas of the genomic clones. The ClGPDHg5, ClGPDHg3 and ClGPDHg9
genomic clones contain the complete structural genes of GPDH in addition to
promoter sequences. The entire promoter of GPDH was sequenced from the
ClGPDHg9 genomic clone.

Thus a 4434 bp DNA fragment of the ClGPDHg5 genomic clone contains parts
of the promoter in the 5' region and the complete structural gene of GPDH. The
DNA sequence, which was sequenced double stranded, as well as the amino acid
sequence derived from it are shown as SEQ ID NO:5 in the Sequence Listing. The
protein-coding sequence with 372 amino acids, which is interrupted by
untranslated DNA regions (introns), begins with the ATG start codon in position
1394 and ends before the TAG stop codon in position 4005. The putative TATA
box is located at positions 1332 to 1336. Transcription presumably starts at
position 1364 (Joshi, NAR 23 (1987) pages 6643-6653). The polyA signal is
located in positions 4205 to 4210 at the 3' end. Position 4221 corresponds to
the last nucleotide before the polyA region of ClGPDH30 cDNA (see position 1365
in SEQ ID NO:4).

The complete structural gene of GPDH as well as parts of the promoter in
5' direction are contained in a 4006 bp DNA fragment from the ClGPDHg3 genomic
clone. The DNA sequence of the DNA fragment from ClGPDHg3, which was sequenced
mostly as a double strand, as well as the amino acid sequence derived from it
are shown as SEQ ID NO:6a and SEQ ID NO:6b in the Sequence Listing. The
protein coding region interrupted by intron sequences begins at position 1182
(see SEQ ID NO:6a) with the ATG start codon and ends with the TAG stop codon at
position 190 (see SEQ ID NO:6b). CAAT box and TATA box signal sequences are
located at positions 1055 to 1058 and 1103 to 1107 before the start of
transcription. Assumed transcription starting points are at positions 1136 and
1148. Owing to a lack of sequence data, a region of approximately 460 base
pairs is not identified within the coding sequence. The polyA signal is
located in the 3' untranslated region at positions 393 to 398 (SEQ ID NO:6b).

The entire promoter as well as the first exon of the sequence coding for 
GPDH are contained in a 1507 bp DNA fragment from the ClGPDHg9 genomic clone. 
The DNA sequence that was sequenced mostly as a double strand as well as the 
amino acid sequence derived from it are shown as SEQ ID NO: 7 in the Sequence 
Listing. The TATA box is located at positions 1108 to 1112 before the start of 
transcription. The protein coding sequence begins with the ATG start codon at 
position 1193 and ends at position 1376, where an untranslated area (intron) 
begins. Transcription presumably starts at position 1144. 
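
The promoter landmarks quoted for the three genomic sequences can be put side
by side to compare the spacing between the TATA box, the putative
transcription start and the ATG. The sketch below uses only positions stated
in the text (each relative to its own SEQ ID entry). The clone labels follow
the assignment of fragments to ClGPDHg5, ClGPDHg3 and ClGPDHg9 as read from
the fragment sizes given elsewhere in this description, and the class and
function names are illustrative.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PromoterFeatures:
    clone: str
    tata_box: Tuple[int, int]  # first and last position of the TATA box
    tss: Tuple[int, ...]       # putative transcription start position(s)
    atg: int                   # first base of the ATG start codon


GENES = [
    PromoterFeatures("ClGPDHg5", (1332, 1336), (1364,), 1394),
    PromoterFeatures("ClGPDHg3", (1103, 1107), (1136, 1148), 1182),
    PromoterFeatures("ClGPDHg9", (1108, 1112), (1144,), 1193),
]


def tata_to_tss(gene):
    """Distance in bases from the last TATA box position to each start site."""
    return [start - gene.tata_box[1] for start in gene.tss]


def leader_length(gene):
    """Bases from each transcription start up to, but excluding, the ATG."""
    return [gene.atg - start for start in gene.tss]


for gene in GENES:
    print(gene.clone, tata_to_tss(gene), leader_length(gene))
```

With these coordinates the first putative start site lies roughly 30 bases
downstream of the TATA box in all three genes, and the untranslated leader is
a few dozen bases long in each case.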

By comparing DNA sequences, it has been found that ClGPDH30 cDNA, which
includes a complete protein reading frame for GPDH, is identical to the GPDH
gene from the ClGPDHg5 genomic clone. Consequently, the ClGPDHg5 genomic clone
can be classified in class A (see above). The ClGPDH132 cDNA with an almost
complete protein reading frame for GPDH is identical to the gene from the
ClGPDHg9 genomic clone, which consequently may be assigned to class B (see
above). The gene from the ClGPDHg3 genomic clone cannot be assigned to either
of the two classes, and thus forms another class C.

Genetic engineering methods (in the form of anti-sense expression or
overexpression) can be used to introduce or transfer the DNA sequences
according to this invention that code for a glycerol-3-phosphate dehydrogenase
into plants for the production of these dehydrogenases for the purpose of
altering the biosynthesis yield of these plants. Inasmuch as the DNA sequences
according to this invention are not a complete transcription unit, they are
preferably introduced into the plants together with suitable promoters,
especially in recombinant vectors, such as binary vectors. Genomic clones can
be used as separate complete transcription units for the transformation of
plants in order to influence the triacylglyceride content and the fatty acid
distribution.

Any species of plants can be transformed for this purpose. Oil-bearing 
plants, such as rapeseed, sunflower, linseed, oil palm and soybean are 
preferred for this transformation in order to influence the triacylglyceride 
biosynthesis in these plants in the manner desired. 

The introduction of DNA sequences according to this invention that code
for a glycerol-3-phosphate dehydrogenase as well as the complete genes of a
glycerol-3-phosphate dehydrogenase contained in the genomic clones by the
methods of genetic engineering can be performed with the aid of conventional
transformation techniques. Such techniques include direct gene transfer, such
as microinjection, electroporation, use of a particle gun, steeping plant parts
in DNA solutions, pollen or pollen tube transformation, viral vector-mediated
transfer and liposome-mediated transfer as well as the transfer of appropriate
recombinant Ti plasmids or Ri plasmids through Agrobacterium tumefaciens and
transformation by plant viruses.

The DNA sequences according to this invention as well as the complete
genes of a glycerol-3-phosphate dehydrogenase contained in the genomic clones
are excellent for achieving a significant increase in oil production by
transgenic plants. This increase in oil yield is obtained with an increase in
the triacylglyceride content of the seed due to overexpression of GPDH.
Furthermore, a reduction in glycerol-3-phosphate dehydrogenase can be obtained
through anti-sense expression or cosuppression, so the building blocks for
triacylglyceride synthesis are missing. This effect is especially beneficial
when the production of wax esters (such as jojoba wax esters) in the seeds of
transgenic plants is to be improved. Another possible application of DNA
sequences according to this invention as well as the genes from the genomic
clones would be for suppressing triacylglyceride biosynthesis in transgenic
plants and making available the CoA ester as well as glycerol-3-phosphate for
other biosyntheses.

Moreover, the promoters of glycerol-3-phosphate dehydrogenase genes from
clones according to this invention can, for example, be used for targeted
expression of chimeric genes in embryo-specific tissue. On the basis of
experimental data it is assumed with regard to the specificity of the promoters
that the promoters of genes from the ClGPDHg5 and ClGPDHg9 genomic clones are
seed-specific, while the promoter of the gene from the ClGPDHg3 genomic clone
has little or no activity in the embryo. Thus, for example, a 1387 bp
BamHI/AlwNI fragment of ClGPDHg5 is suitable for transcriptional fusion, a 1189
base pair SphI/NarI fragment of ClGPDHg9 is suitable for translational fusion
and a 1172 base pair BamHI/BsmAI (partial) fragment of ClGPDHg3 is suitable for
transcriptional fusion. Larger (or smaller) promoter fragments can be used for
expression of chimeric genes on the basis of additional sequences present on
the genomic clones. Likewise, any regulatory sequences located downstream from
the first codon of the GPDH gene can be obtained for targeted expression of
chimeric genes from the cloned fragments of genomic DNA.

Northern Blot analysis with polyA+ RNA from various Cuphea lanceolata
tissues with ClGPDH20 cDNA as a probe shows very large amounts of RNA in
embryos in comparison with other tissues (see Figure 5). The increase in RNA
correlates with increased gene expression and consequently indicates an
extremely strong promoter.

The following examples are presented to illustrate this invention. 

EXAMPLES

The plant material used in the context of the present invention was
obtained from Cuphea lanceolata (Lythraceae) (small lanceolate tube flower).

Example 1 

Production of glycerol-3-phosphate dehydrogenase cDNA
from Cuphea lanceolata

A cDNA library from Cuphea lanceolata (wild type) was prepared with the
help of the ZAP® cDNA synthesis kit according to the manufacturer's
instructions (Stratagene, La Jolla, USA). Messenger RNA from isolated immature
embryos about two to three weeks old was used as the starting material for the
synthesis of the cDNAs. The cDNA library obtained in this way contained
9.5 x 10^5 recombinant phages.

Functional complementation for isolation of cDNAs that code for a
glycerol-3-phosphate dehydrogenase was performed with the E. coli BB26-36
strain (R.M. Bell, J. Bact. 112 (1974) pages 1065-1076). The bacterial medium
for culturing BB26-36 (bearing the plsB26 and plsX mutations) was supplemented
with 0.1% glycerol to support growth of the bacteria. A medium without
glycerol was used for functional complementation.

The pBluescript plasmids were excised from the above cDNA library in λ-ZAP
II according to the manufacturer's instructions (Stratagene) by in vivo
excision using helper phages and then packaged in phage coats: 200 ml of
XL1-Blue E. coli cells (OD600 = 1) were infected with 5 x 10^5 pfu of the λ-ZAP
II cDNA library and, in order to guarantee coinfection, were also infected with
a tenfold amount of f1 R408 helper phages. After incubation for 15 minutes at
37°C for phage adsorption, 5 ml 2xYT medium were added and the culture was
agitated for three more hours at 37°C. During this time, the cells secrete
the pBluescript plasmids packaged in the coats of helper phages, the so-called
phagemids, into the medium. The bacteria were killed and the λ phages were
inactivated by heating for 20 minutes at 70°C. After centrifuging, the
supernatant containing helper phages along with phagemids was removed. This
supernatant was used for infection of the mutant BB26-36 strain.

Complementation was performed after infecting the E. coli BB26-36 strain
with phagemids containing cDNA plasmids that code for a glycerol-3-phosphate
dehydrogenase. M56-LP medium (Bell, loc. cit.) with 50 mg ampicillin was used
for selection (without glycerol-3-phosphate). Retransformation of BB26-36 was
performed by the method of D. Hanahan, J. Mol. Biol. 166 (1983) pages 557-580,
with subsequent plating on the selective medium mentioned.

Deletion clones for determining the sequence of the DNA fragments of
positive cDNA clones were produced by means of exonuclease III (Stratagene) and
were sequenced according to the method of Sanger et al., Proc. Nat. Acad. Sci.
74 (1977) pages 5463-5467. Some of the DNA sequencing was performed
radioactively with the help of the Sequencing Kit or with a Pharmacia
Automated Laser Fluorescent A.L.F.® DNA sequencer. The sequences were analyzed
with the help of computer software from the University of Wisconsin Genetics
Computer Group (J. Devereux et al., Nucl. Acids Res. 12 (1984) pages 387-394).

Furthermore, cDNA clones were isolated by screening a cDNA library from
Cuphea lanceolata with ClGPDH20 cDNA as a probe. For this, a cDNA library from
Cuphea lanceolata (wild type) was produced according to the manufacturer's
instructions with the ZAP® cDNA Synthesis Kit. Messenger RNA from isolated,
immature embryos about two to three weeks old was the starting material for
synthesis of the cDNAs. The cDNA library obtained contained 9.6 x 10^5
recombinant phages with approx. 50% clones with more than 500 bp insertions.
The cDNA library was screened with ClGPDH20 as a probe, and 18 cDNAs were
isolated and partially or completely sequenced in the usual manner. Of these
cDNAs, 12 were in class A and 6 were in class B.

The enzyme measurements were performed with the fusion protein according
to the method of Santora et al., Arch. Biochem. Biophys. (1979) pages 403-411.



Example 2

Production of genomic clones of glycerol-3-phosphate dehydrogenase
from Cuphea lanceolata

Genomic DNA from young Cuphea lanceolata leaves was isolated for this
example (S.L. Della Porta et al., Plant Mol. Biol. Rep. 1 (1983) pages 19-21).
The DNA was then partially cleaved with the restriction enzyme Sau3A,
whereupon DNA fragments of 11,000 to 19,000 base pairs were cloned in vector
λFIXII (Stratagene) that had been cleaved with XhoI, after the respective
cleavage sites had each been partially filled in with two nucleotides. The
unamplified genomic DNA library corresponded to 5.4 times the genome of Cuphea
lanceolata. Thirty-one genomic clones were then isolated from this library
with ClGPDH20 cDNA as a probe.
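
A library that represents 5.4 genome equivalents has a high chance of
containing any given single-copy sequence. The sketch below estimates that
chance with the usual Poisson approximation, P = 1 - exp(-c), where c is the
coverage in genome equivalents. This approximation assumes random, unbiased
cloning and is not taken from the present description; the function names are
illustrative.

```python
import math


def prob_sequence_present(coverage: float) -> float:
    """Probability that a given sequence is in the library (Poisson model)."""
    if coverage < 0:
        raise ValueError("coverage must not be negative")
    return 1.0 - math.exp(-coverage)


def coverage_needed(probability: float) -> float:
    """Genome equivalents needed to reach the requested probability."""
    if not 0.0 <= probability < 1.0:
        raise ValueError("probability must be in the range [0, 1)")
    return -math.log(1.0 - probability)


print(f"{prob_sequence_present(5.4):.3f}")  # chance at 5.4-fold coverage
print(f"{coverage_needed(0.99):.2f}")       # coverage for a 99% chance
```

Under these assumptions the 5.4-fold library exceeds the coverage of about 4.6
genome equivalents needed for a 99% chance of representing a given sequence.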

The three genomic clones ClGPDHg3 (15.9 kb DNA insertion), ClGPDHg5 (17.7
kb DNA insertion) and ClGPDHg9 (15.6 kb DNA insertion) were characterized in
greater detail. Suitable subclones were produced in the usual manner and their
insertions were sequenced with the ExoIII/Mung bean kit and also with
oligonucleotide primers in order to bridge any gaps.

If any of the procedures customary in molecular biology have not been
described adequately here, such procedures were performed by standard methods
as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, second
edition (1989).

SEQUENCE LISTING

GENERAL INFORMATION:

(i) APPLICANT:
    (A) NAME: Max Planck Society for Promotion of the Sciences e.V.
    (B) STREET: Bunsenstrasse 10
    (C) CITY: Goettingen
    (E) COUNTRY: Germany
    (F) ZIP: 37073

(ii) TITLE OF INVENTION: Glycerol-3-phosphate dehydrogenase (GPDH)

(iii) NUMBER OF SEQUENCES: 8

(iv) COMPUTER-READABLE FORM:
    (A) MEDIUM TYPE: 3.5 inch HD diskette (1.44 MB)/ASCII format
    (B) COMPUTER: IBM compatible PC
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPA)

INFORMATION FOR SEQ ID NO:1

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1354 base pairs
    (B) TYPE: Nucleic acid
    (C) STRANDEDNESS: Double strand
    (D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: Cuphea lanceolata

(vii) IMMEDIATE SOURCE:
    (A) LIBRARY: ZAP cDNA library
    (B) CLONE: ClGPDH20

(ix) FEATURE:
    (A) NAME/KEY: cDNA
    (B) LOCATION: 15 to 1345

(ix) FEATURE:
    (A) NAME/KEY: Fusion with lacZ
    (B) LOCATION: 1 to 14

(ix) FEATURE:
    (A) NAME/KEY: Start codon
    (B) LOCATION: 17 to 19

(ix) FEATURE:
    (A) NAME/KEY: Stop codon
    (B) LOCATION: 1133 to 1135

(ix) FEATURE:
    (A) NAME/KEY: PolyA signal
    (B) LOCATION: 1329 to 1334

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1



GAATTCGGCA CGAGCA ATG GCT CCC TCT GAG CTC AAC TGC ACC CAC CAG
Met Ala Pro Ser Glu Leu Asn Cys Thr His Gln
1 5 10

AAC CAG CAT TCA AGC GGT TAC GAC GGA CCC AGA TCG AGG GTC ACC GTT
Asn Gln His Ser Ser Gly Tyr Asp Gly Pro Arg Ser Arg Val Thr Val
15 20 25

GTC GGT AGT GGA AAC TGG GGT AGT GTT GCT GCC AAG CTC ATT GCT ACC
Val Gly Ser Gly Asn Trp Gly Ser Val Ala Ala Lys Leu Ile Ala Thr
30 35 40

AAT ACC CTC AAG CTT CCA TCT TTT CAT GAT GAA GTG AGA ATG TGG GTA
Asn Thr Leu Lys Leu Pro Ser Phe His Asp Glu Val Arg Met Trp Val
45 50 55

TTT GAG GAG ACG CTA CCG AGC GGC GAG AAG CTT ACT GAT GTC ATC AAC
Phe Glu Glu Thr Leu Pro Ser Gly Glu Lys Leu Thr Asp Val Ile Asn
60 65 70 75

CAG ACC AAT GAA AAT GTT AAG TAT CTC CCC GGA ATT AAG CTC GGT AGG
Gln Thr Asn Glu Asn Val Lys Tyr Leu Pro Gly Ile Lys Leu Gly Arg
80 85 90

AAT GTT GTT GCA GAT CCA GAC CTC GAA AAC GCA GTT AAG GAT GCA AAT
Asn Val Val Ala Asp Pro Asp Leu Glu Asn Ala Val Lys Asp Ala Asn
95 100 105

ATG CTC GTG TTT GTG ACA CCG CAT CAG TTC ATG GAG GGC ATC TGC AAA
Met Leu Val Phe Val Thr Pro His Gln Phe Met Glu Gly Ile Cys Lys
110 115 120

AGA CTC GAA GGG AAA ATA CAA GAA GGA GCA CAG GCT CTC TCC CTT ATA
Arg Leu Glu Gly Lys Ile Gln Glu Gly Ala Gln Ala Leu Ser Leu Ile
125 130 135

AAG GGC ATG GAG GTC AAA ATG GAG GGG CCT TGC ATG ATC TCG AGC TTA
Lys Gly Met Glu Val Lys Met Glu Gly Pro Cys Met Ile Ser Ser Leu
140 145 150 155

ATC TCT GAT CTT CTC GGG ATT AAC TGC TGT GTC CTA ATG GGG GCA AAC
Ile Ser Asp Leu Leu Gly Ile Asn Cys Cys Val Leu Met Gly Ala Asn
160 165 170

ATC GCT AAT GAG ATT GCT GTT GAG AAA TTC AGT GAA GCG ACA GTC GGG
Ile Ala Asn Glu Ile Ala Val Glu Lys Phe Ser Glu Ala Thr Val Gly
175 180 185

TTC AGA GAA AAT AGA GAT ATT GCA GAG AAA TGG GTT CAG CTC TTT AGC
Phe Arg Glu Asn Arg Asp Ile Ala Glu Lys Trp Val Gln Leu Phe Ser
190 195 200

ACT CCG TAC TTC ATG GTC TCA GCT GTT GAA GAT GTT GAA GGA GTA GAA
Thr Pro Tyr Phe Met Val Ser Ala Val Glu Asp Val Glu Gly Val Glu
205 210 215

CTT TGT GGA ACA CTG AAG AAT ATC GTG GCC ATA GCA GCC GGT TTT GTG
Leu Cys Gly Thr Leu Lys Asn Ile Val Ala Ile Ala Ala Gly Phe Val
220 225 230 235

GAT GGA TTG GAG ATG GGA AAC AAC ACA AAA GCA GCA ATT ATG AGG ATC
Asp Gly Leu Glu Met Gly Asn Asn Thr Lys Ala Ala Ile Met Arg Ile
240 245 250

GGG TTA CGG GAG ATG AAG GCA TTC TCC AAG CTT TTG TTT CCA TCT GTT 817
Gly Leu Arg Glu Met Lys Ala Phe Ser Lys Leu Leu Phe Pro Ser Val
255 260 265

AAG GAC ACT ACT TTC TTC GAG AGC TGT GGA GTC GCT GAC CTC ATC ACA 865
Lys Asp Thr Thr Phe Phe Glu Ser Cys Gly Val Ala Asp Leu Ile Thr
270 275 280

ACT TGT TTG GGC GGG AGA AAC AGA AAA GTT GCT GAG GCT TTT GCA AAG 913
Thr Cys Leu Gly Gly Arg Asn Arg Lys Val Ala Glu Ala Phe Ala Lys
285 290 295

AAT GGC GGG AAA AGG TCA TTC GAT GAT CTC GAA GCA GAG ATG CTC CGG 961
Asn Gly Gly Lys Arg Ser Phe Asp Asp Leu Glu Ala Glu Met Leu Arg
300 305 310 315

GGG CAA AAA TTA CAG GGT GTC TCA ACA GCA AAG GAG GTC TAT GAA GTC 1009
Gly Gln Lys Leu Gln Gly Val Ser Thr Ala Lys Glu Val Tyr Glu Val
320 325 330

TTG GGG CAC CGA GGC TGG CTC GAG CTG TTC CCG CTC TTC TCA ACC GTG 1057
Leu Gly His Arg Gly Trp Leu Glu Leu Phe Pro Leu Phe Ser Thr Val
335 340 345

CAC GAG ATA TCC ACT GGC CGT CTG CCT CCT TCA GCC ATC GTC GAA TAC 1105
His Glu Ile Ser Thr Gly Arg Leu Pro Pro Ser Ala Ile Val Glu Tyr
350 355 360

AGC GAA CAA AAA ACC ATC TTC TCT TGG TAGAGCAAGA GGCTGCCCTT 1152
Ser Glu Gln Lys Thr Ile Phe Ser Trp
365 370

GAAAGACTAA GAGCCACCCT GCCCTGTTTA AAGGGCTAAA AGTTTAATAT TTCTCTGCAG 1212 
CCTAAACAGT CGGAAACATT GAAAATCTAG GATGTATAAG AAAAAAAAAA GAAGGTTTGA 1272 
AGGAAGTATG GATGGGCATG AATGTATTTA TTTTCGGTAT ACTCTTTTTC TGCAAAAATA 1332 
ATTTCTTCAG AAAGGGGGGC CC 1354 
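
The pairing of each codon row with the three-letter amino acid row beneath it can be checked mechanically. The following Python sketch is an illustration only, assuming the standard genetic code; the helper names `translate` and `mismatches` are hypothetical and are not part of this listing. The example codons are the first coding codons of SEQ ID NO: 1.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter amino acid codes ("*" marks a stop codon).
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def translate(codons: str) -> list[str]:
    """Translate whitespace-separated DNA codons into three-letter codes."""
    return [ONE_TO_THREE[CODON_TABLE[c]] for c in codons.upper().split()]


def mismatches(codons: str, annotated: str) -> list[tuple[int, str, str, str]]:
    """Return (1-based position, codon, expected, annotated) for each disagreement."""
    triplets = codons.upper().split()
    expected = translate(codons)
    found = annotated.split()
    return [
        (i, c, e, f)
        for i, (c, e, f) in enumerate(zip(triplets, expected, found), start=1)
        if e != f
    ]


if __name__ == "__main__":
    row = "ATG GCT CCC TCT GAG CTC AAC TGC ACC CAC CAG"
    print(" ".join(translate(row)))
    # A misread label such as "Gin" for Gln is reported with its position.
    print(mismatches("AAC CAG CAT", "Asn Gin His"))
```

Running such a check over each codon row and its annotation row is one way to locate character-level misreadings in the amino acid labels without touching the nucleotide data.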
(2) INFORMATION FOR ID SEQ NO: 2 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1464 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS : Double stranded 

(D) TOPOLOGY: Linear 
(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Cuphea lanceolata

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: ZAP cDNA library

(B) CLONE: C1GPDH109

(ix) FEATURE:

(A) NAME/KEY: cDNA

(B) LOCATION: 15 to 1454

(ix) FEATURE:

(A) NAME/KEY: CDS [coding sequence]

(B) LOCATION: 15 to 1187

(ix) FEATURE:

(A) NAME/KEY: Fusion with lacZ

(B) LOCATION: 1 to 14

(ix) FEATURE:

(A) NAME/KEY: Start codon

(B) LOCATION: 45 to 47

(ix) FEATURE:

(A) NAME/KEY: Stop codon

(B) LOCATION: 1188 to 1190

(ix) FEATURE:

(A) NAME/KEY: PolyA signal

(B) LOCATION: 1414 to 1419

(ix) FEATURE:

(A) NAME/KEY: PolyA region

(B) LOCATION: 1446 to 1454

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2



GAATTCGGCA CGAGCTTCCT CTGTTCTTCC TCTCTGCCTC TGCA ATG GCG CCT GCC 56 

Met Ala Pro Ala 

1 

TTC GAA CCC CAT CAG CTG GCT CCC TCT GAG CTT AAC TCT GCC CAC CAG 104
Phe Glu Pro His Gln Leu Ala Pro Ser Glu Leu Asn Ser Ala His Gln
5 10 15 20

AAC CCA CAT TCA GGC GGA TAT GAC GGA CCC AGA TCG AGG GTC ACT GTC 152
Asn Pro His Ser Gly Gly Tyr Asp Gly Pro Arg Ser Arg Val Thr Val
25 30 35

GTC GGC AGC GGC AAC TGG GGC AGC GTC GCT GCC AAG CTC ATT GCT TCC 200
Val Gly Ser Gly Asn Trp Gly Ser Val Ala Ala Lys Leu Ile Ala Ser
40 45 50

AAC ACC CTC AAG CTC CCA TCT TTC CAT GAT GAA GTG AGG ATG TGG GTA 248
Asn Thr Leu Lys Leu Pro Ser Phe His Asp Glu Val Arg Met Trp Val
55 60 65

TTT GAG GAG ACT CTA CCG GGC GGC GAG AAG CTC ACT GAT ATC ATC AAC 296
Phe Glu Glu Thr Leu Pro Gly Gly Glu Lys Leu Thr Asp Ile Ile Asn
70 75 80

CAG ACC AAT GAA AAT GTT AAA TAT CTT CCC GGA ATT AAG CTC GGT GGG 344
Gln Thr Asn Glu Asn Val Lys Tyr Leu Pro Gly Ile Lys Leu Gly Gly
85 90 95 100

AAT GTT GTT GCT GAT CCA GAC CTC GAA AAT GCA GTT AAG GAT GCA AAT 392
Asn Val Val Ala Asp Pro Asp Leu Glu Asn Ala Val Lys Asp Ala Asn
105 110 115

ATG CTC GTG TTT GTC ACA CCG CAT CAG TTC ATG GAG GGC ATC TGC AAA 440
Met Leu Val Phe Val Thr Pro His Gln Phe Met Glu Gly Ile Cys Lys
120 125 130

AGA CTT GTC GGG AAG ATA CAG GAA GGA GCG CAG GCT CTC TCC CTT ATA 488
Arg Leu Val Gly Lys Ile Gln Glu Gly Ala Gln Ala Leu Ser Leu Ile
135 140 145

AAA GGC ATG GAG GTC AAG ATG GAG GGG CCT TGC ATG ATC TCG AGC CTA 536
Lys Gly Met Glu Val Lys Met Glu Gly Pro Cys Met Ile Ser Ser Leu
150 155 160

ATC TCA GAT CTT CTC GGG ATC AAC TGC TGT GTC CTT AAT GGG GCA AAC 584
Ile Ser Asp Leu Leu Gly Ile Asn Cys Cys Val Leu Asn Gly Ala Asn
165 170 175 180

ATC GCT AAT GAG ATT GCT GTT GAG AAA TTC AGT GAA GCG ACT GTC GGG
Ile Ala Asn Glu Ile Ala Val Glu Lys Phe Ser Glu Ala Thr Val Gly
185 190 195

TTC AGA GAA AAT AGA GAT ATT GCG GAA AAA TGG GTT CAG CTC TTT AGC
Phe Arg Glu Asn Arg Asp Ile Ala Glu Lys Trp Val Gln Leu Phe Ser
200 205 210

ACT CCA TAC TTC ATG GTC TCA GCT GTT GAA GAT GTT GAA GGA GTA GAG
Thr Pro Tyr Phe Met Val Ser Ala Val Glu Asp Val Glu Gly Val Glu
215 220 225

CTT TGT GGA ACA CTG AAG AAT ATT GTG GCC ATA GCA GCG GGT TTT GTT
Leu Cys Gly Thr Leu Lys Asn Ile Val Ala Ile Ala Ala Gly Phe Val
230 235 240

GAT GGA TTG GAG ATG GGA AAC AAC ACA AAA GCG GCA ATT ATG AGG ATC
Asp Gly Leu Glu Met Gly Asn Asn Thr Lys Ala Ala Ile Met Arg Ile
245 250 255 260

GGG CTG CGG GAG ATG AAA GCG TTC TCC AAG CTT TTG TTT CCA TCT GTT
Gly Leu Arg Glu Met Lys Ala Phe Ser Lys Leu Leu Phe Pro Ser Val
265 270 275

AAG GAC ACT ACT TTT TTC GAG AGC TGC GGA GTC GCT GAT CTC ATC ACA
Lys Asp Thr Thr Phe Phe Glu Ser Cys Gly Val Ala Asp Leu Ile Thr
280 285 290

ACT TGT TTG GGC GGA AGA AAC AGA AAA GTC GCT GAG GCT TTT GCA AAG
Thr Cys Leu Gly Gly Arg Asn Arg Lys Val Ala Glu Ala Phe Ala Lys
295 300 305

AAT GGC GGA AAC AGG TCA TTT GAT GAT CTC GAA GCA GAG ATG CTC CGG
Asn Gly Gly Asn Arg Ser Phe Asp Asp Leu Glu Ala Glu Met Leu Arg
310 315 320

GGG CAA AAA TTA CAG GGT GTC TCG ACA GCG AAA GAG GTC TAC GAG GTC
Gly Gln Lys Leu Gln Gly Val Ser Thr Ala Lys Glu Val Tyr Glu Val
325 330 335 340

CTG AGG CAC CGA GGC TGG CTC GAG TTG TTC CCG CTC TTC TCA ACC GTG
Leu Arg His Arg Gly Trp Leu Glu Leu Phe Pro Leu Phe Ser Thr Val
345 350 355

CAT GAG ATC TCC AGT GGC CGT CTG CCT CCT TCA GCC ATT GTT GAA TAC
His Glu Ile Ser Ser Gly Arg Leu Pro Pro Ser Ala Ile Val Glu Tyr
360 365 370

AGC GAA CAA AAG CCT ACC TTC TCT TGG TAGAGAAAGA AACCAGGAAG
Ser Glu Gln Lys Pro Thr Phe Ser Trp
375 380

AACGGCGAGC CACTGTCCCC CGTTTAAAGG TTTACTATTT CTCTCTGCAC TTTGCAGCCT 1267
GAAGAGTCGG AAACATAGAA AATCTAGGAA GTTTCAGAAA AAGGAAGGTT TTGAGGATGT 1327
ATGGATGATA TATATACTAG GTGGGTATGA AGAGGAAGTT ATTACTATGA TGTTGGTATG 1387
TGGTAATGGC TAAGTACATG AGATCAAATA AATAGACAGA CCTTGGTTTC TTCTTTCTAA 1447 
AAAAAAAGGG GGGGCCC 1464 



(2) INFORMATION FOR ID SEQ NO: 3

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1490 base pairs

(B) TYPE: Nucleic acid

(C) STRANDEDNESS: Double

(D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Cuphea lanceolata

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: ZAP cDNA library

(B) CLONE: C1GPDH132

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 15 to 1115

(ix) FEATURE:

(A) NAME/KEY: Fusion with lacZ

(B) LOCATION: 1 to 14

(ix) FEATURE:

(A) NAME/KEY: Stop codon

(B) LOCATION: 1116 to 1118

(ix) FEATURE:

(A) NAME/KEY: PolyA signal

(B) LOCATION: 1343 to 1348

(ix) FEATURE:

(A) NAME/KEY: PolyA region

(B) LOCATION: 1465 to 1484

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3

GAATTCGGCA CGAG CTT AAC TCT GCC CAC CAG AAC CCA CAT TCC AGC GGA 50
Leu Asn Ser Ala His Gln Asn Pro His Ser Ser Gly
1 5 10

TAT GAA GGA CCC AGA TCG AGG GTC ACC GTC GTT GGC AGC GGC AAC TGG 98
Tyr Glu Gly Pro Arg Ser Arg Val Thr Val Val Gly Ser Gly Asn Trp
15 20 25

GGC AGC GTC GCT GCC AAG CTC ATT GCT TCC AAC ACC CTC AAG CTC CCA 146
Gly Ser Val Ala Ala Lys Leu Ile Ala Ser Asn Thr Leu Lys Leu Pro
30 35 40

TCT TTC CAT GAT GAA GTG AGG ATG TGG GTA TTT GAG GAG ACT CTA CCG 194
Ser Phe His Asp Glu Val Arg Met Trp Val Phe Glu Glu Thr Leu Pro
45 50 55 60

GGC GGC GAG AAG CTC ACT GAT ATC ATC AAC CAG ACC AAT GAA AAT GTT 242
Gly Gly Glu Lys Leu Thr Asp Ile Ile Asn Gln Thr Asn Glu Asn Val
65 70 75

AAA TAT CTT CCC GGA ATT AAG CTC GGT AGG AAT GTT GTT GCA GAT CCA 290
Lys Tyr Leu Pro Gly Ile Lys Leu Gly Arg Asn Val Val Ala Asp Pro
80 85 90

GAC CTC GAA AAC GCA GTT AAG GAT GCA AAT ATG CTC GTT TTC GTC ACA 338
Asp Leu Glu Asn Ala Val Lys Asp Ala Asn Met Leu Val Phe Val Thr
95 100 105

CCG CAT CAG TTC GTG GAG GGC ATC TGC AAA AGA CTT GTA GGG AAG ATA 386
Pro His Gln Phe Val Glu Gly Ile Cys Lys Arg Leu Val Gly Lys Ile
110 115 120

CAG GAA GGA GCG CAG GCT CTC TCT CTT ATA AAA GGC ATG GAG GTC AAA 434
Gln Glu Gly Ala Gln Ala Leu Ser Leu Ile Lys Gly Met Glu Val Lys
125 130 135 140

ATG GAG GGG CCT TGC ATG ATC TCG AGC CTA ATC TCA GAT CTT CTC GGG 482
Met Glu Gly Pro Cys Met Ile Ser Ser Leu Ile Ser Asp Leu Leu Gly
145 150 155
ATC AAT TGC TGT GTC CTT AAT GGG GCG AAC ATC GCT AAT GAG ATT GCT 530
Ile Asn Cys Cys Val Leu Asn Gly Ala Asn Ile Ala Asn Glu Ile Ala
160 165 170

GTT GAG AAA TTC AGT GAA GCG ACT GTC GGG TTC AGA GAA AAT AGA GAT 578
Val Glu Lys Phe Ser Glu Ala Thr Val Gly Phe Arg Glu Asn Arg Asp
175 180 185

ATT GCG GAA AAA TGG GTT CAG CTC TTT AGC ACT CCA TAC TTC ATG GTC 626
Ile Ala Glu Lys Trp Val Gln Leu Phe Ser Thr Pro Tyr Phe Met Val
190 195 200

TCA GCT GTT GAA GAT GTT GAA GGA GTA GAG CTT TGT GGA ACA CTG AAG 674
Ser Ala Val Glu Asp Val Glu Gly Val Glu Leu Cys Gly Thr Leu Lys
205 210 215 220

AAT ATT GTG GCC ATA GCA GCG GGT TTT GTG GAT GGA CTG GAG ATG GGA 722
Asn Ile Val Ala Ile Ala Ala Gly Phe Val Asp Gly Leu Glu Met Gly
225 230 235

AAC AAC ACA AAA GCA GCA ATT ATG AGG ATC GGG CTG CGG GAG ATG AAA 770
Asn Asn Thr Lys Ala Ala Ile Met Arg Ile Gly Leu Arg Glu Met Lys
240 245 250

GCG TTC TCC AAG CTT TTG TTT CCA TCT GTT AAG GAC ACT ACT TTT TTC 818
Ala Phe Ser Lys Leu Leu Phe Pro Ser Val Lys Asp Thr Thr Phe Phe
255 260 265

GAG AGC TGC GGA GTC GCT GAT CTC ATC ACA ACT TGT TTG GGC GGA AGA 866
Glu Ser Cys Gly Val Ala Asp Leu Ile Thr Thr Cys Leu Gly Gly Arg
270 275 280

AAC AGA AAA GTC GCT GAG GCT TTT GCA AAG AAT GGC GGT AAC AGG TCA 914
Asn Arg Lys Val Ala Glu Ala Phe Ala Lys Asn Gly Gly Asn Arg Ser
285 290 295 300

TTC GAT GAT CTC GAA GCA GAG ATG CTC CGG GGG CAA AAA TTA CAG GGT 962
Phe Asp Asp Leu Glu Ala Glu Met Leu Arg Gly Gln Lys Leu Gln Gly
305 310 315

GTC TCG ACA GCG AAA GAG GTC TAC GAG GTC CTG AGG CAC CGA GGT TGG 1010
Val Ser Thr Ala Lys Glu Val Tyr Glu Val Leu Arg His Arg Gly Trp
320 325 330

CTC GAG TTG TTC CCG CTC TTC TCA ACC GTG CAT GAG ATC TCC ACT GGC 1058
Leu Glu Leu Phe Pro Leu Phe Ser Thr Val His Glu Ile Ser Thr Gly
335 340 345

CGT CTG CCT CCT TCA GCC ATT GTT GAA TAC AGC GAA CAA AAG CCC ACC 1106
Arg Leu Pro Pro Ser Ala Ile Val Glu Tyr Ser Glu Gln Lys Pro Thr
350 355 360

TTC TCT TGG TAGAGAAAGA AGCAACCAGG AAGAACGGCG AGCCACTCTG 1155
Phe Ser Trp
365



CCTCGTTTAA AGGGTTACTA TTTCTCTACA CTCTGCAGCC TGAAGAGTCG GAAACATCGA 1215 
AAATCTAGGA AGTCTCAGAA AAATGAAGGT TTGGAGGATG TATGGATGAT ATATATACTA 1275 
GGTGGGTATG AAGAGGAAGT TATTACTATG ATGTTGGTAT GTGGTAATGG CTAAGTACAT 1335 
GAGATCAAAT AAATAGACAG ACCTTGGTTT CTTCTATCTC GATTCGGTCT CGTCGAGTTT 1395 
GGCGAAACTC AACTGAACTT CCTGAGTACC CTGCTACCTA TTACATGTAA TGTTCCTATT 1455 
TATATGCTTA AAAAAAAAAA AAAAAAAAAC TCGAG 1490 
(2) INFORMATION FOR ID SEQ NO: 4

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1390 base pairs

(B) TYPE: Nucleic acid

(C) STRANDEDNESS: Double strand

(D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Cuphea lanceolata

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: ZAP cDNA library

(B) CLONE: C1GPDH30

(ix) FEATURE:

(A) NAME/KEY: cDNA

(B) LOCATION: 15 to 1384

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 34 to 1149

(ix) FEATURE:

(A) NAME/KEY: Fusion with lacZ

(B) LOCATION: 1 to 14

(ix) FEATURE:

(A) NAME/KEY: Start codon

(B) LOCATION: 34 to 36

(ix) FEATURE:

(A) NAME/KEY: Stop codon

(B) LOCATION: 1150 to 1152

(ix) FEATURE:

(A) NAME/KEY: PolyA signal

(B) LOCATION: 1349 to 1354

(ix) FEATURE:

(A) NAME/KEY: PolyA region

(B) LOCATION: 1366 to 1384

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4



GAATTCGGCA CGAGTTTCTT CTCAGCCTCT GCA ATG GCT CCC TCT GAG CTC AAC 54
Met Ala Pro Ser Glu Leu Asn
1 5

TGC ACC CAC CAG AAC CCA CAT TCA AGC GGT TAC GAC GGA CCC AGA TCG 102
Cys Thr His Gln Asn Pro His Ser Ser Gly Tyr Asp Gly Pro Arg Ser
10 15 20

AGG GTC ACC GTT GTC GGT AGT GGA AAC TGG GGC AGT GTC GCT GCC AAG 150
Arg Val Thr Val Val Gly Ser Gly Asn Trp Gly Ser Val Ala Ala Lys
25 30 35

CTC ATT GCT TCC AAT ACC CTC AAG CTT CCA TCT TTT CAT GAT GAA
Leu Ile Ala Ser Asn Thr Leu Lys Leu Pro Ser Phe His Asp Glu
40 45 50

AGA ATG TGG GTA TTT GAG GAG ACT CTA CCG AGC GGC GAG AAG CTT ACT
Arg Met Trp Val Phe Glu Glu Thr Leu Pro Ser Gly Glu Lys Leu Thr
60 65 70

GAT GTC ATC AAC CAG ACC AAT GAA AAT GTT AAG TAT CTC CCC GGA ATT 294
Asp Val Ile Asn Gln Thr Asn Glu Asn Val Lys Tyr Leu Pro Gly Ile
75 80 85

AAG CTC GGT AGG AAT GTT GTT GCA GAT CCA GAC CTC GAA AAC GCA GTT 342
Lys Leu Gly Arg Asn Val Val Ala Asp Pro Asp Leu Glu Asn Ala Val
90 95 100

AAG GAT GCA AAT ATG CTC GTG TTT GTG ACA CCG CAT CAG TTC ATG GAG 390
Lys Asp Ala Asn Met Leu Val Phe Val Thr Pro His Gln Phe Met Glu
105 110 115

GGC ATC TGC AAA AGA CTC GTA GGG AAA ATA CAG GAA GGA GCA CAG GCT 438
Gly Ile Cys Lys Arg Leu Val Gly Lys Ile Gln Glu Gly Ala Gln Ala
120 125 130 135

CTC TCC CTT ATA AAG GGC ATG GAG GTC AAA ATG GAG GGG CCT TGC ATG 486
Leu Ser Leu Ile Lys Gly Met Glu Val Lys Met Glu Gly Pro Cys Met
140 145 150

ATC TCG AGC CTA ATC TCT GAT CTT CTC GGG ATC AAC TGC TGT GTC CTA 534
Ile Ser Ser Leu Ile Ser Asp Leu Leu Gly Ile Asn Cys Cys Val Leu
155 160 165

ATG GGG GCA AAC ATC GCT AAT GAG ATT GCT GTT GAG AAA TTC AGT GAA 582
Met Gly Ala Asn Ile Ala Asn Glu Ile Ala Val Glu Lys Phe Ser Glu
170 175 180

GCG ACA GTC GGG TTC AGA GAA AAT ACA GAT ATT GCG GAG AAA TGG GTT 630
Ala Thr Val Gly Phe Arg Glu Asn Thr Asp Ile Ala Glu Lys Trp Val
185 190 195

CAG CTC TTT AGC ACT CCG TAC TTC ATG GTC TCA GCT GTT GAA GAT GTT 678
Gln Leu Phe Ser Thr Pro Tyr Phe Met Val Ser Ala Val Glu Asp Val
200 205 210 215

GAA GGA GTA GAA CTT TGT GGA ACA CTG AAG AAT ATC GTG GCC ATA GCA 726
Glu Gly Val Glu Leu Cys Gly Thr Leu Lys Asn Ile Val Ala Ile Ala
220 225 230

GCC GGT TTT GTG GAT GGA TTG GAG ATG GGA AAC AAC ACA AAA GCA GCA 774
Ala Gly Phe Val Asp Gly Leu Glu Met Gly Asn Asn Thr Lys Ala Ala
235 240 245

ATT ATG AGG ATC GGG TTA CGG GAG ATG AAG GCA TTC TCC AAG CTT TTG 822
Ile Met Arg Ile Gly Leu Arg Glu Met Lys Ala Phe Ser Lys Leu Leu
250 255 260

TTT CCA TCT GTT AAG GAC ACT ACT TTC TTC GAG AGC TGT GGA GTT GCT 870
Phe Pro Ser Val Lys Asp Thr Thr Phe Phe Glu Ser Cys Gly Val Ala
265 270 275

GAC CTC ATC ACA ACT TGT TTG GGC GGG AGA AAC AGA AAA GTT GCT GAG 918
Asp Leu Ile Thr Thr Cys Leu Gly Gly Arg Asn Arg Lys Val Ala Glu
280 285 290 295

GCT TTT GCA AAG AAT GGC GGG GAA AGG TCA TTC GAT GAT CTC GAA GCA 966
Ala Phe Ala Lys Asn Gly Gly Glu Arg Ser Phe Asp Asp Leu Glu Ala
300 305 310
GAG CTG CTC CGG GGG CAA AAA TTA CAG GGT GTC TCA ACA GCA AAG GAG 1014
Glu Leu Leu Arg Gly Gln Lys Leu Gln Gly Val Ser Thr Ala Lys Glu
315 320 325

GTC TAT GAA GTC TTG GGG CAC CGA GGC TGG CTC GAG CTG TTC CCG CTC 1062
Val Tyr Glu Val Leu Gly His Arg Gly Trp Leu Glu Leu Phe Pro Leu
330 335 340

TTC TCA ACC GTG CAC GAG ATC TCC ACT GGC CGT CTG CAT CCT TCA GCC 1110
Phe Ser Thr Val His Glu Ile Ser Thr Gly Arg Leu His Pro Ser Ala
345 350 355

ATC GTC GAA TAC AGC GAA CAA AAA ACC ATC TTC TCT TGG TAGAGCAAGA 1159
Ile Val Glu Tyr Ser Glu Gln Lys Thr Ile Phe Ser Trp
360 365 370

GGCTGCCCTT GAAAGACTAA GAGCCACCCT GCCCTGTTTA AAGGGCTAAA AGTTTAATAT 1219
TTCTCTGCAG CCTAAACAGT TGGAAACATT GAAAATCTAG GATGTATCAG AAAAAAGAAG 1279
GTTTGGAGGA AGTATGGATG ATATAGAGGA CATGAATGTA TTCATTTTCG GTATACTCTT 1339
TTTCTGCAAA ATAATTCTTC AGATGTAAAA AAAAAAAAAA AAAAACTCGA G 1390



(2) INFORMATION FOR ID SEQ NO: 5

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4434 base pairs

(B) TYPE: Nucleic acid

(C) STRANDEDNESS: Double strand

(D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Cuphea lanceolata

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Genomic lambda FIX II

(B) CLONE: ClGPDHgS

(ix) FEATURE:

(A) NAME/KEY: TATA signal

(B) LOCATION: 1332 to 1336

(ix) FEATURE:

(A) NAME/KEY: Start codon

(B) LOCATION: 1394 to 1396

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: Join (1394 to 1550, 2066 to 2142, 2241 to
2313, 2405 to 2622, 2719 to 2826, 2961 to
3024, 3223 to 3260, 3342 to 3462, 3541 to
3595, 3692 to 3740, 3850 to 4005)

(ix) FEATURE:

(A) NAME/KEY: Stop codon

(B) LOCATION: 4006 to 4008

(ix) FEATURE:

(A) NAME/KEY: PolyA signal

(B) LOCATION: 4205 to 4210

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5



GGATCCTTAG AAGACAAGCG CGGGGCGGGC ATGGGTCTCG TGATACCCGC CCCATTTTGC 60
CCCATTCCAT CCCTATATGG TAAGCAGATC TCACTGAAAA GTCACCGTTT CTGGATGGTT 120
TCCAGATGAT TTTGTCCCTC CCTCTAGCTG CATTAGGTGA TGGGATTGAG GCTATTCTAA 180
GAACCAGCTC GTGTGGAAGG TAGGCGGAGA TTAGCTCCCA GTTCCATCCT CCTGTATTTG 240
AAGCGAAGAA AGAAACTGGG TTGTCTAGCA TGTTTTGTGG GACAGGTTTG GTCGTCTTTT 300
CTGATAGGCT CTGATTCAAT AGAAGCCAAT TATCTCTCCA AAAGGAAACC TTATTACCAC 360
TTCCAATCGA CCACCCTATG TACTTGCTGA TCTTCGGCCA GGTATCGCAT AAAGCATTCC 420
ATAACGCTGA TGCTGTCGTC TTTTTTGTGA ATGTTGGCAA GAGTGTGTCT GGCATGGCAT 480
ATTTGTGACT GAGCACCCGC ACCCAAAGGC TCTGAGGTTG TGATGCCATA TCCCAACATA 540
CCTTCGATAG AAAGGCTTCA TTCATCTTCC GTAGCTTACG AATGCCAAGA CCACCCCATG 600
GTGCTGGACT AGTGACCGTG GACCAATTGA CCAAATGCAC CTTCCTTTGC TCCATTGAAT 660
GGCCCCAAAT GAAGTTGCCG CAATGTCTTT CGATTTCATC AAGTGTTCCA TGAGGAATAC 720
GTGTGGACTG CATGGAGAAG GATGGCAGAG CCGTCAAGAC AGATTTCACC AGCGTCACCC 780
GCCCAGCCAT TGACAGTGTC GATGCCGACC AACCAGCAAG TCTTGCTTTT ACCTCGACAT 840
GTTTTGGATT TTATATACCG GTGGTGATGG TGTTTGAATT AATCATCGTC ATTAATTTAT 900
ACCGTGCAAT ATATATTGCA ACATTCCAAA GTATAATTAA TTTTATATGT CCATTCGTGA 960
CTAATCTTGG AGATAGGGCT TAAATTGTTA TATGATGATA TAGAAGAAGT TGGATAGCAC 1020
ATAAGAACTC TATAAAATGC TTATAGATCA TGGCATCGAA TTCATCCGCT ATATATGAGT 1080
GAGGAAGAAA CTAATCAAAA CCTCGTATTC ATCGAAAGAA CCGTTGAAGT GGTTACACTT 1140
TGAATCCTAA GACATACTTG ACGTCATGAT TCTGTCTCTC TATTCCATTG CATAATAAAT 1200
AAAACAAAGG AAACAAAAGC ATAGAGGAGA TCGCCAGATT CAGCAGTTTC CGCATAGGTT 1260
GCCACGGAGC CTTACATGCC GATGCCTTCC TCTGCCTCCT TCTTCCTCCT GTCTCTCTCT 1320
CTACATCCCC TTATATCCCT TCCTCCTTCC CTCCATCTTC ACCATTCCTC TGTTTTTCTT 1380

CTCAGCCTCT GCA ATG GCT CCC TCT GAG CTC AAC TGC ACC CAC CAG AAC 1429
Met Ala Pro Ser Glu Leu Asn Cys Thr His Gln Asn
1 5 10

CCA CAT TCA AGC GGT TAC GAC GGA CCC AGA TCG AGG GTC ACC GTT GTC 1477
Pro His Ser Ser Gly Tyr Asp Gly Pro Arg Ser Arg Val Thr Val Val
15 20 25

GGT AGT GGA AAC TGG GGC AGT GTC GCT GCC AAG CTC ATT GCT TCC AAT 1525
Gly Ser Gly Asn Trp Gly Ser Val Ala Ala Lys Leu Ile Ala Ser Asn
30 35 40

ACC CTC AAG CTT CCA TCT TTT CAT G GTTCGTCTCT CCTTTTCTCT 1570
Thr Leu Lys Leu Pro Ser Phe His
45 50



GAAAAATGAA GCTTTTGCAT GGGATAGTCA CTAGATATGA GCCTCTGTTT GCATGACTGA 1630
AGCGCTTGAG TAACCGAGTT TTTGGAACAA GAGCACAGGT GGTTCCTTTG CATTTTCTTT 1690
GAGGTTCCTT AATCATTCAA TGAAGTAGCG GTTGATCGCT GAGCAATTGA AACTTGTGGA 1750
ATCGAACCTC CAGCCGAGTC TTAGTGTAAT TGCTTTCTGT TTTACTTCAT TCATAGTGGG 1810
AAGGAGTACG AACTGATGAG TGATGTCACA TTTCATTAGT CGGGTTGCGA AAAAACTCAG 1870
TTGACATATT GGTCGAGACT CTGCAGTGTC ATCAGATATG AGTTGGTGTA TTTGTATTGA 1930
CATTTGAATT TGGTATGTGT ATGAATTTTG TTGAATTAAT CACCGCTGTG ATGAAAAGAT 1990
CAGTACTTCT TCGGTCATTT TTCAGGTGGA AGGATGTTGG TTTCTTATAT ATGTAACTTT 2050
ACATGAATTT TTCAG AT GAA GTG AGA ATG TGG GTA TTT GAG GAG ACT CTA 2100
Asp Glu Val Arg Met Trp Val Phe Glu Glu Thr Leu
55 60

CCG AGC GGC GAG AAG CTT ACT GAT GTC ATC AAC CAG ACC AAT 2142
Pro Ser Gly Glu Lys Leu Thr Asp Val Ile Asn Gln Thr Asn
65 70 75

GTAAGGAAAC ACAGATTAGC AATAGCATGA GCAGTTATTG CTGGTTAAAT ATGCTTGTTA 2202
GCAACTTTCG TGACGGCCTG AGTTTTATAC CTCTGCAG GAA AAT GTT AAG TAT 2255
Glu Asn Val Lys Tyr
80

CTC CCC GGA ATT AAG CTC GGT AGG AAT GTT GTT GCA GAT CCA GAC CTC 2303
Leu Pro Gly Ile Lys Leu Gly Arg Asn Val Val Ala Asp Pro Asp Leu
85 90 95

GAA AAC GCA G GTAGTCCATG TGTTCATTAG AATTCTCTAA TTAATTATTG 2353
Glu Asn Ala
100

TGGTTTATTT CCTTGTCTCT GTGATGATAT TCTGGATGAA ATTTTGTGCA G TT AAG 2409
Val Lys
GAT GCA AAT ATG CTC GTG TTT GTG ACA CCG CAT CAG TTC ATG GAG GGC 2457
Asp Ala Asn Met Leu Val Phe Val Thr Pro His Gln Phe Met Glu Gly
105 110 115 120

ATC TGC AAA AGA CTC GTA GGG AAA ATA CAG GAA GGA GCA CAG GCT CTC 2505
Ile Cys Lys Arg Leu Val Gly Lys Ile Gln Glu Gly Ala Gln Ala Leu
125 130 135

TCC CTT ATA AAG GGC ATG GAG GTC AAA ATG GAG GGG CCT TGC ATG ATC 2553
Ser Leu Ile Lys Gly Met Glu Val Lys Met Glu Gly Pro Cys Met Ile
140 145 150

TCG AGC CTA ATC TCT GAT CTT CTC GGG ATC AAC TGC TGT GTC CTA ATG 2601
Ser Ser Leu Ile Ser Asp Leu Leu Gly Ile Asn Cys Cys Val Leu Met
155 160 165

GGG GCA AAC ATC GCT AAT GAG GTAAACACTT GGCACGATCT GGTTGCAACT 2652
Gly Ala Asn Ile Ala Asn Glu
170 175

CCCCCAGGAA ATTGTAGATC CTCATACTGT TAGCATCTTG ATGAGGTTAA ATATCTTATG 2712
TTGTAG ATT GCT GTT GAG AAA TTC AGT GAA GCG ACA GTC GGG TTC AGA 2760
Ile Ala Val Glu Lys Phe Ser Glu Ala Thr Val Gly Phe Arg
180 185

GAA AAT ACA GAT ATT GCG GAG AAA TGG GTT CAG CTC TTT AGC ACT CCG 2808
Glu Asn Thr Asp Ile Ala Glu Lys Trp Val Gln Leu Phe Ser Thr Pro
190 195 200 205

TAC TTC ATG GTC TCA GCT GTAAGTTGCG ATAAAACCTT ACGTTTTGCT 2856
Tyr Phe Met Val Ser Ala
210

AATAGAACAC AATGCTAGAA ACTCCCAGAT TTCAATGTTA TGTATTTTGG TGCCCAAAGA 2916
AGCAACTTCT TAACATCTGT GGCTCCTCTT ACTGACAAAA ATAG GTT GAA GAT GTT 2972
Val Glu Asp Val
215

GAA GGA GTA GAA CTT TGT GGA ACA CTG AAG AAT ATC GTG GCC ATA GCA 3020
Glu Gly Val Glu Leu Cys Gly Thr Leu Lys Asn Ile Val Ala Ile Ala
220 225 230

GCC G GTTCGTGTTT ACGAGATGTA CATTTATGTA TAACAATCTT TCATTTATTC 3074
ATCGAGATGG GATGCAATAT ATCAATGAGA GGGAAAAGAA AGGGCAAAGG AAAATGCTGT 3134
TGTATTGCAG CTTTAGGCAT TCTTTTCTCT TAATTATTAA CTGTGAAACA CCGAGAAGTA 3194
TTGATGAAGT TAAGAAACGA TGTTACAG GT TTT GTG GAT GGA TTG GAG ATG 3245
Gly Phe Val Asp Gly Leu Glu Met
235 240

GGA AAC AAC ACA AAA GTAAGTCTAA ATTTTTTGTA AAACTTAAAG TAAGAGTTTA 3300
Gly Asn Asn Thr Lys
245

TGCTTTGGCA TTGTTTGAAG TTCACTTACT AATGACTTTA G GCA GCA ATT ATG 3353
Ala Ala Ile Met

AGG ATC GGG TTA CGG GAG ATG AAG GCA TTC TCC AAG CTT TTG TTT CCA 3401
Arg Ile Gly Leu Arg Glu Met Lys Ala Phe Ser Lys Leu Leu Phe Pro
250 255 260 265

TCT GTT AAG GAC ACT ACT TTC TTC GAG AGC TGT GGA GTT GCT GAC CTC 3449
Ser Val Lys Asp Thr Thr Phe Phe Glu Ser Cys Gly Val Ala Asp Leu
270 275 280

ATC ACA ACT TGT T GTAAGGAAGC ATATAGATTT CCTTCGAATA TGAATAAATT 3502
Ile Thr Thr Cys
285

GCATAGTTCA TATCATCATA ATTTGTGTTT GTGCTCAG TG GGC GGG AGA AAC 3554
Leu Gly Gly Arg Asn
290

AGA AAA GTT GCT GAG GCT TTT GCA AAG AAT GGC GGG GAA AG 3595
Arg Lys Val Ala Glu Ala Phe Ala Lys Asn Gly Gly Glu Arg
295 300

GTCGTGTTTC CCTTTCGTCG ATCCTGATTT AATTCCTGTT TAGTGGTATT CACTTTGTGT 3655
GTATGTAAAT CAAGCAACTA TTTCCATCAT CTTCAG G TCA TTC GAT GAT CTC 3707
Ser Phe Asp Asp Leu
305

GAA GCA GAG CTG CTC CGG GGG CAA AAA TTA CAG GTACATGATG AAGAAACCGA 3760
Glu Ala Glu Leu Leu Arg Gly Gln Lys Leu Gln
310 315 320

TGTCTATACA GAAAGAGTCC ATTGCAAAGC TTGAGAATGT TTCGAGCATA AAGAGCATAA 3820
GAATATTCTT TTCGGTGATT TTCATGCAG GGT GTC TCA ACA GCA AAG GAG GTC 3873
Gly Val Ser Thr Ala Lys Glu Val
325

TAT GAA GTC TTG GGG CAC CGA GGC TGG CTC GAG CTG TTC CCG CTC TTC 3921
Tyr Glu Val Leu Gly His Arg Gly Trp Leu Glu Leu Phe Pro Leu Phe
330 335 340

TCA ACC GTG CAC GAG ATC TCC ACT GGC CGT CTG CAT CCT TCA GCC ATC 3969
Ser Thr Val His Glu Ile Ser Thr Gly Arg Leu His Pro Ser Ala Ile
350 355 360

GTC GAA TAC AGC GAA CAA AAA ACC ATC TTC TCT TGG TAGAGCAAGA 4015
Val Glu Tyr Ser Glu Gln Lys Thr Ile Phe Ser Trp
365 370

GGCTGCCCTT GAAAGACTAA GAGCCACCCT GCCCTGTTTA AAGGGCTAAA AGTTTAATAT 4075
TTCTCTGCAG CCTAAACAGT TGGAAACATT GAAAATCTAG GATGTATCAG AAAAAAGAAG 4135
GTTTGGAGGA AGTATGGATG ATATAGAGGA CATGAATGTA TTCATTTTCG GTATACTCTT 4195
TTTCTGCAAA ATAATTCTTC AGATGTTTTT GTGGTATGAG ATATAGAGGA CATGTATGTA 4255
TGCGGTAAGG CTGAAGTAAA CAAGTTACCA TAAGAGACAG CCCTCTCGGT TTCTTCCATC 4315
TGATCGATTC GTCTCGTCGA ATTTGCCAAA AGCTCAAAAC TCAACTCATC CCCTGCTTTC 4375
TATCCATATG GGCAAGGAAT ACAATTAGAC CAGTTTGATA CTTGTAATGA GAAGTTTAC 4434
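
The segments named in a `Join (...)` CDS location can be converted into exon lengths and intron boundaries. The following Python sketch is an illustration only, assuming 1-based inclusive coordinates as used throughout this listing; the function names `parse_join`, `segment_lengths` and `introns` are hypothetical. The example uses only the first three segments of the CDS location given for SEQ ID NO: 5.

```python
import re


def parse_join(location: str) -> list[tuple[int, int]]:
    """Extract (start, end) pairs from a 'Join (a to b, c to d, ...)' string."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\s+to\s+(\d+)", location)]


def segment_lengths(segments: list[tuple[int, int]]) -> list[int]:
    """Length of each segment, counting both end positions."""
    return [end - start + 1 for start, end in segments]


def introns(segments: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Gaps between consecutive segments, as (first, last) intron positions."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(segments, segments[1:])
    ]


if __name__ == "__main__":
    exons = parse_join("Join (1394 to 1550, 2066 to 2142, 2241 to 2313)")
    print(exons)
    print(segment_lengths(exons))
    print(introns(exons))
```

When the complete list of segments is supplied, the sum of the segment lengths should be a multiple of three if the location describes an uninterrupted reading frame that excludes the stop codon, which gives a simple consistency check on the coordinates.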



(2) INFORMATION FOR ID SEQ NO: 6

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2955 base pairs

(B) TYPE: Nucleic acid

(C) STRANDEDNESS: Double strand

(D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Cuphea lanceolata

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Genomic lambda FIX II

(B) CLONE: ClGPDHg3

(ix) FEATURE:

(A) NAME/KEY: CAAT signal

(B) LOCATION: 1055 to 1058

(ix) FEATURE:

(A) NAME/KEY: TATA signal

(B) LOCATION: 1103 to 1107

(ix) FEATURE:

(A) NAME/KEY: Start codon

(B) LOCATION: 1182 to 1184

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: Join (1182 to 1326, 1837 to 1913, 2010 to
2082, 2180 to 2397, 2480 to 2587, 2668 to
2731, 2848 to 2885, 2947 to 2955)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6



GGATCCTCCT CGATGGTGGT CCAATGAAGA CTATACAAAA CCAAGCCGAC GGAATCCGGT 60 

GCACAATAAC TTGAAGCCAT GAAAACCAAT GCAATATATA GAGTACGCCT TGTACTATGT 120 

AATATATTTA CAATTTTCTC TTGAATAGTT TAGGTTTGGT GATCGTAAAC TCGCAAAACA 180 

CATATGTGCG TGTGTAAATA TATCTGGTGA TGATGTATGA AGAGAGTGCG GTTTAATTAC 240

CCGGTATTGT ATAAGGTTGT ATCTGCAGTT GACACTTTCA GTAGAAATTA CTAATAACTC 300 

GACGAGATAC AAACGACTCG AGTTTCAGAA ATAAGTGGCA AAACGTTATG GGGTTCTCCT 360 

TGATTCTTCG TGGAAGGTAT ACTATTAATC ATGTTCGCCT CCGTCCTAGT AGAAACATAG 420 

AGTTTTTATC GGGATGCAGA TTGCAGATGA TAGAACTATT GTCAGATTCA TTATGCATAT 480 

AGGATAGGCC TTCTACTGAT TTGGAAACTT ATATCGATTC TGTTGGAATG GATGTATGAA 540 

AAGCTTCATA TCCGACATTG AAAATTTGGT CATATCAATA AGATGAACTA ACAAAATATG 600 

CCAACCTCTT GGAAGCAAAA CACATCCGAG ACTTTAAGAT GTGGCTGAGG TTTCTGCAAC 660 

TTTAAATCTC CCATATGCTT GACAGAATTG GTAGACCTAA CTCAATGGAT TTCATTCAAT 720 

GATCGAAGTT TCTCTATCGA TCATAGCTGT GAATTAGTAA GCAAATGTCC ATAATATATC 780 

CCCGAAAACA CGTAAAGTTA GGTCTCATTA CATTAGGCCT CAACCATATG TTATAAGTAA 840 

ATTTGTTTTT TTTTTTTTCT CTTACAGTTG AATGTATCAA ATCGAAAAAA CCGTTAAGTC 900 

GTTGCGGCCC TTTGAATAGT AAGCCAAAGA TCCGAAAGAA AAAGTAAACA GAGACAGAGC 960 

AATGAGGAGA TGGCCAGTTT GAGAAGCAAA CGCATAGGTT GCCACGGAGG AGGCGGAGAC 1020 

GGGTCATCGA TGACTTTCTC CGCCTCCTTA ACCGCAATGG CGATGCCGCC ATACCTCTCT 1080 
GTCACCCTCT CTCCATTCCC TTTATATCTC TCCCGCTTCT TCCTCTGCTC CACTCAACCC 1140

CCTCTGCATA AACTCTGTGC TTTTTTAGTC TCTCCCCTGC T ATG TCG CCG GCA 1193
Met Ser Pro Ala
1

TTC GAA CCC CAT CAG CAG AAG CCT ACC ATG GAG AAC ATG CGA TTC CGA 1241
Phe Glu Pro His Gln Gln Lys Pro Thr Met Glu Asn Met Arg Phe Arg
5 10 15 20

GTC ACC ATC ATT GGC AGC GGT AAC TGG GGC AGC GTC GCC GCT AAG CTC 1289
Val Thr Ile Ile Gly Ser Gly Asn Trp Gly Ser Val Ala Ala Lys Leu
25 30 35

ATT GCC TCC AAC ACC CTC AAC CTC CCG TCT TTC CAC G GTTTGTCTGC 1336
Ile Ala Ser Asn Thr Leu Asn Leu Pro Ser Phe His
40 45



CACTCTTCTT TCTTCATGAT CAGGCTCTTG CCAGTAGAGA CATGTCTTTT CATGAATCAA 1396 

GCACCCGTTT TTTCGATGAG GATCACTGAG TTTGATTTAA GGGTATCCGA TGCAACTGCT 1456 

GAAAAGATGT GGTTATTTTT GTTCTTTCAT GAAGTATCAT CTGAGAAATT TGATCTTAGC 1516 

CTAAGCGGCA TTACTTTCGG TGTTAAGTTC ATTCTATGTG AGTAGGAGTA TGAGGTGATG 1576 

CCGCGTGATT CCAATCAGGT ACCGATGAAA ATCAGTAGAC ATGGTTGCAG TTGAGGTTCC 1636 

ATAGTTTACA CAGCATAGGA GTTGCTGTAT TTCTATTGAC GCTTGGATTT GTTTGGTGCT 1696 

TATAATCCCG GTTTTTACTA ATTGGTTATG AACACCGATA ATAACAACAG TTAGATTTCT 1756 

TCAACATTAA CCGGTTGAAG ATTAGGCCAT ATTCTTATTT GGGTACTATT TCTTAAGAAA 1816

ACATTCATAT TTTCTTTCAG AT GAA GTA AGG ATG TGG GTG TTT GAG GAG 1865
Asp Glu Val Arg Met Trp Val Phe Glu Glu
50 55

ACA TTG CCA AGC GGC GAG AAG CTC ACT GAA GTC ATC AAC CGG ACC AAT 1913
Thr Leu Pro Ser Gly Glu Lys Leu Thr Glu Val Ile Asn Arg Thr Asn
60 65 70

GTAAGGAAGA TCAATTTAGC ATGTCATTGT ATTAACATAA AGAGCGTTTA TTGGCAACTT 1973

TGGCTTTCAT GATGTTCGAG TGTTGCGTCT TTGCAG GAA AAT GTT AAG TAT CTG 2027
Glu Asn Val Lys Tyr Leu
75 80

CCT GGA TTC AAG CTT GGC AGA AAT GTT ATT GCA GAC CCA AAC CTT GAA 2075
Pro Gly Phe Lys Leu Gly Arg Asn Val Ile Ala Asp Pro Asn Leu Glu
85 90 95

AAT GCA G GTAGTGATTG TATTTCAGTG CTCGGTTGAA TGATCAAGTA AAATCCTCGT 2132
Asn Ala

GCTAAATATG TCGAGATGTT CGTGTTTTTG CATAATGTTT TGTTTAG TT AAG GAA 2187 

Val Lys Glu 
100 

GCA AAC ATG CTT GTA TTT GTC ACA CCG CAT CAG TTC GTG GAG GGC CTT 2235 
Ala Asn Met Leu Val Phe Val Thr Pro His Gin Phe Val Glu Gly Leu 

1°5 . 110 us 

TGC AAG AGA CTC GTC GGG AAG ATA AAG GCA GGTGCA GAG GCT CTC TCC 2283 
Cys Lys Arg Leu Val Gly Lys He Lys Ala Gly Ala Glu Ala Leu Ser 
120 125 130 

CTT ATA AAG GGC ATG GAG GTC AAA AGG GAA GGG CCT TCC ATG ATA TCT 2331 
Leu He Lys Gly Met Glu Val Lys Arg Glu Gly Pro Ser Met He Ser 
135 140 145 

ACC TTA ATC TCG AGC CTT CTC GGG ATC AAC TGC TGT GTC CTA ATG GGA 2379 
Thr Leu lie Ser Ser Leu Leu Gly He Asn Cys Cys Val Leu Met Gly 
150 155 160 165 

GCA AAC ATC GCC AAC GAG GTAAAATCTT GGTGCAGTCT TACGAGATTC 2427 
Ala Asn He Ala Asn Glu 

170 

TGAATCTTGA ACCTGTTAGC ATTTTGACAC ACTGTGACTT CTAAATTTGT AG ATT 2482 

He 

GCT CTT GAG AAA TTC AGT GAG GCG ACA GTC GGA TAC AGA GAA AAT AAG 2530 
Ala Leu Glu Lys Phe Ser Glu Ala Thr Val Gly Tyr Arg Glu Asn Lys 
I 75 150 185 

GAT ACT GCA GAG AAA TGG GTT CGG CTC TTC AAC ACT CCA TAC TTC CAA 2578 
Asp Thr Ala Glu Lys Trp Val Arg Leu Phe Asn Thr Pro Tyr Phe Gin 
190 . 195 200 



GTC TCG TCT GTGAGTACGA ATAAACCTTT CCTTCTGCGA ACAAAAAACT 2627 
Val Ser Ser 
205 



TCCCGAGGCA GGAACTAAAT GAAACAAGTT AACATAATAG GTT CAA GAT GTG GAA 2682 

Val Gln Asp Val Glu 
210 

GGA GTG GAA CTT TGT GGC ACA CTG AAG AAT GTC GTG GCC ATA GCA GCC G 2731 
Gly Val Glu Leu Cys Gly Thr Leu Lys Asn Val Val Ala Ile Ala Ala 
215 220 225 

GTACTTATAT ACGATCTCCA CATTTATATA AACTAGTTAG AAAGATTTTG GATTGCTGTA 2791 

AAAACCGTGG AAAAACCCGA AAAGTGTTGA TGAAGTGTTA CCAAATGTTG TTTCAG GT 2849 

Gly 
TTT GTA GAT GGA CTG GAG ATG GGA AAC AAC ACA AAG GTAAGTCCAA 
Phe Val Asp Gly Leu Glu Met Gly Asn Asn Thr Lys 
230 235 240 

AGTTCATGCA AATTTTTTCG TATTTACGAC TGAATGCTTG GATATACATA G GCT GCG 

Ala Ala 

ATT 
Ile 



(2) INFORMATION FOR SEQ ID NO: 7 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 574 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS : Double strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: No 

(iv) ANTI-SENSE: No 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Cuphea lanceolata 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Genomic lambda FIX II 

(B) CLONE : ClGPDHg3 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 31 to 189 
(ix) FEATURE: 

(A) NAME/KEY: Stop codon 

(B) LOCATION: 190 to 192 
(ix) FEATURE: 
(A) NAME/KEY: PolyA signal 

(B) LOCATION: 393 to 398 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7 

GGCATATCGA TGATTTTTCC TATCTTGCAG GGT GTC TTG ACA GCA AAA GAG GTG 54 
Gly Val Leu Thr Ala Lys Glu Val 
1 5 

TAT GAG GTA CTG AAG CAC CGG GGC TGG CTC GAG CGT TTC CCG CTC TTC 102 
Tyr Glu Val Leu Lys His Arg Gly Trp Leu Glu Arg Phe Pro Leu Phe 
10 15 20 

GCA ACT GTG CAT GAG ATC TCA TCT GGC AGG TTG CCT CCT TCA GCC ATT 150 
Ala Thr Val His Glu Ile Ser Ser Gly Arg Leu Pro Pro Ser Ala Ile 
25 30 35 40 

GTC AAA TAC AGC GAA CAA AAG CCC GTC TTA TCT CGA GGT TAGAACGAGA 199 
Val Lys Tyr Ser Glu Gln Lys Pro Val Leu Ser Arg Gly 
45 50 



GAAAACCCGA CAAACCGGTG AAACTCGTAG TCTTAAACTG AAATCCAAAA ACATGCTGGG 259 
AACATCAGCA AAAACCATTC ATCAAGGATG TCTTAGATAA AAGGTTTCAG GAAGAAATAG 319 
ATGGTAGTGT GTGTAATGTT ATCAGCAATC ATTCATTCAT TTATTAAGTA TTTTTTGCAT 379 
CATATTTTAT GCTAATAATT ATTACATAAA TTACTCAAAT TTTGTCAAAA TTTCTGCATT 439 
GCCCCAAACA GATTAATGCA TTGAGAAAAA CTTATAAAGC TTTATCCAGC ATACATATAG 499 

I^STI!^ AATACAAAAA CACCCTTCTA AGCCTCTTTG AAGATGGAGT TTGATCACAC 559 
ATTAAAATGC TTTTT 3' 574 
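
The feature table for SEQ ID NO: 7 gives 1-based, inclusive coordinates (CDS 31 to 189, stop codon 190 to 192). The following minimal Python sketch shows one way to check such coordinates against the listed sequence and its translation. It is an illustration only and not part of the original listing. The names `CODON_TABLE`, `SEQ7_PREFIX`, `extract_feature` and `translate` are hypothetical, and the 199-base prefix is transcribed by hand from the listing, reading the first codon of the coding region as GGT (Gly) and the codon for Glu 45 as GAA; both readings are assumptions.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# First 199 bases of SEQ ID NO: 7, transcribed by hand (assumed readings:
# GGT for the first CDS codon, GAA for the Glu 45 codon).
SEQ7_PREFIX = (
    "GGCATATCGATGATTTTTCCTATCTTGCAG"                     # bases 1-30, upstream of the CDS
    "GGTGTCTTGACAGCAAAAGAGGTG"                           # bases 31-54
    "TATGAGGTACTGAAGCACCGGGGCTGGCTCGAGCGTTTCCCGCTCTTC"   # bases 55-102
    "GCAACTGTGCATGAGATCTCATCTGGCAGGTTGCCTCCTTCAGCCATT"   # bases 103-150
    "GTCAAATACAGCGAACAAAAGCCCGTCTTATCTCGAGGT"            # bases 151-189
    "TAGAACGAGA"                                         # bases 190-199
)


def extract_feature(seq, start, end):
    """Return the span start..end using 1-based, inclusive coordinates."""
    return seq[start - 1:end]


def translate(cds):
    """Translate complete codons of cds into one-letter amino acid codes."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


cds = extract_feature(SEQ7_PREFIX, 31, 189)
stop = extract_feature(SEQ7_PREFIX, 190, 192)
print(len(cds), translate(cds), stop)
```

Under these assumptions the extracted coding region is 159 bases long, translates to the 53 residues printed beneath the sequence (Gly 1 to Gly 53), and is followed directly by a TAG stop codon.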

(2) INFORMATION FOR SEQ ID NO: 8 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1507 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS : Double strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: DNA (molecular) 

(iii) HYPOTHETICAL: No 
(iv) ANTI-SENSE: No 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Cuphea lanceolata 
(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Genomic lambda FIX II 

(B) CLONE :ClGPDHg9 
(ix) FEATURE: 

(A) NAME/KEY: TATA signal 

(B) LOCATION: 1108 to 1112 
(ix) FEATURE: 

(A) NAME/KEY: Start codon 

(B) LOCATION: 1193 to 1195 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1193 to 1376 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8 



GCATGCGGGC AGGCAGGCAG GCATGGGTCT AAATTCTAGA AGACCCAGAC ATATTCATTT 
TGTTCACAAC CGACCCATCA ATATATTGAT TAATTTTGTT TAAATTTATC ATCAGTTTTT 120 
ATTTAATATT TTTAAATAGG TTTACCTTGA TCGTGATAAT TATTTAATAT TACTTTGTAA 180 
TAGTTTATTT ATCTAGCGTT ATAAAATAAC ATTTGAATTC GTTGATGATA TGTGTATTTT 240 
TACTATGTTT ATATGAAATT TATATTTCAA ATATTAAATA ATGTTCTTAT TTTGGCCTAT 300 
GGAGAAGTAT CATCAATTTT TCTATTAAAT AACAGTCTTC AGTTTAGTCA AATCAGTTGA 360 
TAAGTTCCCA AATCACACAT TGTTTGTATG AAAATTTTAA TAAAAAAGTT AAGATGGTAT 420 
TATTATAGAA AAATATATAA AGTATCTTTA AATAATAATT TCTTTTTAAT ACAAAAAGGA 480 
ATATTTGATT ACTTGACTTA TAAAATTTAT TGATAAGGAT GCCAACTTTC ATTTTAGAAA 540 
CTAGAGTAAT GATGGTTAAA TTCCCCGAAA AATGGTATGT CAATTTATTG ATACGTTCCA 600 
CTACTATTT CTGAGACATT TACATGTTTG TAAAAAAAAT CTATATATTT AAATTAAGAT 660 
GGGTGTAATC AATTATAAAA TACAGCGAAT TTTAACACCG AATGAATAGA TTATCTGCAT 720 
AACAATTTAT ACCATCCCTA AATACGAATT AGCAAGTTAA TAAAATTTAA TTACACGAAC 780 
CATGATTATA TAAATTATCG AATCCCCGAC GTGGGGACGT ACCGAACCAA CCGTTGAAGT 
GGTTGCCCTT TGAATCCTAA GACATACAGA CGTCATGATT CTTTGTCTCT CTATCTGTCC 
ATTTACATAA TAAAATCAAA GAGAAGAAAA CAGAGGAAGC AGAGCATAGC ATAGCATAGC 
ATAGAGGAGA TCGCCAGATT CAGCTGTTTC CTCATAGTTT GCCACGAGAC ATACATTGCA 
TTGCCCGATG CCTTTCTCCG CCTCCTTGTC CCTCTCCTCA TTCCCCCGAT GCCTTTCTCC 
GCCTCCTTGT CCCTCTCCTC ATTCCCTTAT ATCCCTCCTC CCCTCCCTCT TCTTCCTCTG 

CTCAACTCCT CCCCCTCACC CTCTTCCTCT OTTCTTCCTC TCTGCCTCTG CA ATG 

Met 
1 

GCG CCT GCC TTC GAA CCC CAT CAG CTG GTT CCT TCT GAG CTT AAC TCT 
Ala Pro Ala Phe Glu Pro His Gln Leu Val Pro Ser Glu Leu Asn Ser 

5 10 15 

GCC CAC CAG AAC CCA CAT TCC AGC GGA TAT GAA GGA CCC AGA TCG AGG 
Ala His Gln Asn Pro His Ser Ser Gly Tyr Glu Gly Pro Arg Ser Arg 
20 25 30 

GTC ACC GTC GTT GGC AGC GGC AAC TGG GGC AGC GTC GCT GCC AAG CTC 
Val Thr Val Val Gly Ser Gly Asn Trp Gly Ser Val Ala Ala Lys Leu 
35 40 45 

ATT GCT TCC AAC ACC CTC AAG CTC CCA TCT TTC CAT G GTTAGTCTCT 
Ile Ala Ser Asn Thr Leu Lys Leu Pro Ser Phe His 
50 55 60 

5?f? TCATGGAATA GTCTCTAGAC ATGAGCCCCT 

GTTTGCATGG TTTTGTTTTG TCTTTGAAAC ATGAATAAAG GTGGTTTCTT GTGTTGGTAC 
Patent Claims 

1. DNA sequences which are isolated from plants and code for a glycerol-3-phosphate dehydrogenase, and the alleles as well as derivatives of these DNA sequences. 

2. DNA sequences according to claim 1, wherein they are isolated from Cuphea 

3. Genomic clones which are isolated from genomic plant DNA and contain a complete gene of a glycerol-3-phosphate dehydrogenase, and the alleles as well as derivatives of this gene. 

4. Genomic clones according to claim 3, wherein the complete gene contains the promoter sequence and other regulatory elements in addition to the structural gene. 

5. Genomic clones according to claim 4, wherein the plant DNA originates from Cuphea lanceolata. 

6. Promoters and other regulatory elements of the glycerol-3-phosphate dehydrogenase gene from one of the genomic clones according to claims 3 to 5, and the alleles as well as the derivatives of these promoters. 

7. DNA sequences according to claim 1, obtained from functional complementation with mutants of a microorganism. 

8. DNA sequences according to claim 7, wherein the microorganism is E. coli BB26-36. 

9. Procedure for producing plants, plant parts and plant products, the triglyceride content or fatty acid pattern of which is altered, in which a DNA sequence according to one of claims 1 or 2, or a gene originating from the genomic clones according to one of claims 3 to 5, is transferred by genetic engineering methods. 

10. Procedure according to claim 9, wherein the DNA sequence or the gene is transferred by microinjection, electroporation, particle gun, steeping of plant parts in DNA solutions, pollen or pollen tube transformation, transfer of corresponding recombinant Ti plasmids or Ri plasmids with Agrobacterium tumefaciens, liposome-mediated transfer, or by plant viruses. 

11. Use of a DNA sequence according to one of claims 1 or 2, or of a gene originating from the genomic clones according to one of claims 3 to 5, for altering the biosynthesis output in plants. 

12. Plants, plant parts and plant products produced according to a procedure of claims 9 or 10. 